[TableGen][DecoderEmitter] Rework table construction/emission #155889

s-barannikov · 2025-08-28T17:14:39Z

Current state

We have the FilterChooser class, which can be thought of as a tree of encodings. Tree nodes are instances of FilterChooser itself and come in two types:

A node containing a single encoding that has constant bits in the specified bit range, a.k.a. a singleton node.
A node containing only child nodes, where each child represents a set of encodings that have the same constant bits in the specified bit range.

Either of these nodes can have an additional child, which represents a set of encodings that have some unknown bits in the same bit range.

As can be seen, the data structure is very high level.
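The shape of that tree can be sketched roughly as follows. This is a simplified, hypothetical model written for illustration: the names `EncodingTreeNode`, `makeSingleton` and `countEncodings` are assumptions and are not taken from FilterChooser.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified model of the encoding tree (not the real class).
struct EncodingTreeNode {
  // The bit range [StartBit, StartBit + NumBits) this node filters on.
  unsigned StartBit = 0;
  unsigned NumBits = 0;

  // Singleton node: the one encoding with constant bits in the range.
  // Left empty for a node that only has children.
  std::string SingletonEncoding;

  // Non-singleton node: one child per distinct constant value of the range.
  std::map<uint64_t, std::unique_ptr<EncodingTreeNode>> Children;

  // Optional additional child: encodings with unknown bits in the range.
  // Either node type may have it.
  std::unique_ptr<EncodingTreeNode> VariableChild;

  bool isSingleton() const { return !SingletonEncoding.empty(); }
};

// Builds a singleton node for the named encoding.
std::unique_ptr<EncodingTreeNode> makeSingleton(std::string Name) {
  assert(!Name.empty() && "a singleton node needs an encoding name");
  std::unique_ptr<EncodingTreeNode> N(new EncodingTreeNode());
  N->SingletonEncoding = std::move(Name);
  return N;
}

// Counts the encodings reachable from a node, including the optional child.
unsigned countEncodings(const EncodingTreeNode &N) {
  unsigned Count = N.isSingleton() ? 1 : 0;
  for (const auto &KV : N.Children)
    Count += countEncodings(*KV.second);
  if (N.VariableChild)
    Count += countEncodings(*N.VariableChild);
  return Count;
}
```

The model only captures the node shapes described above; it says nothing about how bit ranges are chosen.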

The encoding tree represented by FilterChooser is then converted into a finite-state machine (FSM), represented as a byte array. The translation is straightforward: for each node of the tree we emit a sequence of opcodes that check encoding bits and predicates for each encoding. For a singleton node we also emit a terminal "decode" opcode.

The translation is done in one go, and this has negative consequences:

We miss optimization opportunities.
We have to use "fixups" when encoding transitions in the FSM since we don't know the size of the data we want to jump over in advance. We have to emit the data first and then fix up the location of the jump. This means the fixup size has to be large enough to encode the longest jump, so most of the transitions are encoded inefficiently.
Finally, when converting the FSM into human readable form, we have to decode the byte array we've just emitted. This is also done in one go, so we can't do any pretty printing.
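The cost of fixups can be illustrated with a small sketch. It is not the emitter's code: `emitScopeWithFixup`, `emitScopeWithKnownSize` and `appendULEB128` are hypothetical helpers, and the 16-bit placeholder is an assumption modelled on the fixed-width `nts_t` field of the old format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends Value to Table in ULEB128 form and returns the number of bytes used.
unsigned appendULEB128(uint64_t Value, std::vector<uint8_t> &Table) {
  unsigned Count = 0;
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // more bytes follow
    Table.push_back(Byte);
    ++Count;
  } while (Value != 0);
  return Count;
}

// Fixup style: the body size is unknown up front, so a fixed-width
// placeholder is reserved, the body is emitted, and the placeholder is
// patched afterwards. Body must not alias Table.
void emitScopeWithFixup(const std::vector<uint8_t> &Body,
                        std::vector<uint8_t> &Table) {
  size_t FixupPos = Table.size();
  Table.push_back(0); // 16-bit placeholder (assumed width)
  Table.push_back(0);
  Table.insert(Table.end(), Body.begin(), Body.end());
  size_t NumToSkip = Table.size() - FixupPos - 2;
  assert(NumToSkip <= 0xffff && "jump does not fit the fixed-width field");
  Table[FixupPos] = NumToSkip & 0xff;
  Table[FixupPos + 1] = (NumToSkip >> 8) & 0xff;
}

// Size-first style: the body size is known before emission, so it can be
// written as a variable-length integer that is as short as possible.
void emitScopeWithKnownSize(const std::vector<uint8_t> &Body,
                            std::vector<uint8_t> &Table) {
  appendULEB128(Body.size(), Table);
  Table.insert(Table.end(), Body.begin(), Body.end());
}
```

For an 8-byte body the fixup variant spends two bytes on the jump (`8, 0`) while the size-first variant spends one (`8`), which matches the `OPC_Scope, 8, 0` versus `OPC_Scope, 8` difference visible in the updated tests.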

This PR

We introduce an intermediary data structure, the decoder tree, that can be thought of as the AST of the decoder program.
This data structure is low level and as such allows for optimization and analysis.
It resolves all the issues listed above. We can now:

Emit more optimal opcode sequences.
Compute the size of the data to be emitted in advance, avoiding fixups.
Do pretty printing.

Serialization is done by a new class, DecoderTableEmitter, which converts the AST into an FSM in textual form, streamed right into the output file.
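A minimal sketch of the "compute sizes first, then emit" idea is shown below. It makes several assumptions: `DecoderNode`, `leaf`, `scope`, `computeSize` and `emitTable` are hypothetical names, only two node kinds are modelled, the opcode value is illustrative, and raw bytes are emitted instead of the textual form DecoderTableEmitter produces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical opcode value, chosen for illustration only.
enum : uint8_t { OpScope = 1 };

// Hypothetical, simplified AST node; the real decoder tree has more kinds.
struct DecoderNode {
  enum Kind { Leaf, Scope } K = Leaf;
  std::vector<uint8_t> OpBytes;      // Leaf: an already encoded opcode
  std::vector<DecoderNode> Children; // Scope: the nodes forming the body
};

DecoderNode leaf(std::vector<uint8_t> Bytes) {
  DecoderNode N;
  N.OpBytes = std::move(Bytes);
  return N;
}

DecoderNode scope(std::vector<DecoderNode> Children) {
  DecoderNode N;
  N.K = DecoderNode::Scope;
  N.Children = std::move(Children);
  return N;
}

// Number of bytes Value occupies when encoded as ULEB128.
unsigned ulebSize(uint64_t Value) {
  unsigned N = 0;
  do {
    Value >>= 7;
    ++N;
  } while (Value != 0);
  return N;
}

size_t computeSize(const DecoderNode &N);

// Total encoded size of a scope's children. Recomputed on every call to
// keep the sketch short; a real implementation would likely cache it.
size_t bodySize(const DecoderNode &N) {
  size_t Size = 0;
  for (const DecoderNode &C : N.Children)
    Size += computeSize(C);
  return Size;
}

// Encoded size of a node, known before a single byte is emitted.
size_t computeSize(const DecoderNode &N) {
  if (N.K == DecoderNode::Leaf)
    return N.OpBytes.size();
  size_t Body = bodySize(N);
  return 1 + ulebSize(Body) + Body; // opcode + size operand + body
}

void emit(const DecoderNode &N, std::vector<uint8_t> &Table) {
  if (N.K == DecoderNode::Leaf) {
    Table.insert(Table.end(), N.OpBytes.begin(), N.OpBytes.end());
    return;
  }
  Table.push_back(OpScope);
  // The size is written directly as ULEB128; no placeholder, no fixup.
  uint64_t Body = bodySize(N);
  do {
    uint8_t Byte = Body & 0x7f;
    Body >>= 7;
    Table.push_back(Body != 0 ? uint8_t(Byte | 0x80) : Byte);
  } while (Body != 0);
  for (const DecoderNode &C : N.Children)
    emit(C, Table);
}

std::vector<uint8_t> emitTable(const std::vector<DecoderNode> &Nodes) {
  size_t Expected = 0;
  for (const DecoderNode &N : Nodes)
    Expected += computeSize(N);
  std::vector<uint8_t> Table;
  for (const DecoderNode &N : Nodes)
    emit(N, Table);
  assert(Table.size() == Expected && "precomputed size must match output");
  return Table;
}
```

Because every size is available before emission, the same traversal could also print indentation and offsets, which is what makes pretty printing practical.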

Results

The new approach immediately resulted in 12% total table size savings across all in-tree targets, without implementing any optimizations on the AST. Many tables observe ~20% size reduction.
The generated file is much more readable.
The implementation is arguably simpler and more straightforward (the diff is only +150~200 lines, which feels rather small for the benefits the change gives).
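To make the new table layout more concrete, here is a toy interpreter for the `OPC_SwitchField(uleb128 Start, uint8_t Len, [uleb128 Val, uleb128 Size]...)` form. It is a sketch under assumptions: the opcode values, `decode`, `readULEB128` and `extractField` are hypothetical, a case size of 0 is taken to mark the last case (as the updated test output suggests), and a failed match simply returns -1 instead of popping a scope as the generated decoder would.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcode values, chosen for illustration only.
enum : uint8_t { OpSwitchField = 2, OpDecode = 5 };

// Reads one ULEB128 value and advances Ptr past it.
uint64_t readULEB128(const uint8_t *&Ptr) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  uint8_t Byte;
  do {
    assert(Shift < 64 && "malformed ULEB128");
    Byte = *Ptr++;
    Value |= uint64_t(Byte & 0x7f) << Shift;
    Shift += 7;
  } while (Byte & 0x80);
  return Value;
}

// Extracts Insn[Start + Len - 1 : Start].
uint64_t extractField(uint64_t Insn, unsigned Start, unsigned Len) {
  assert(Len >= 1 && Start + Len <= 64 && "field out of range");
  uint64_t Mask = Len == 64 ? ~uint64_t(0) : (uint64_t(1) << Len) - 1;
  return (Insn >> Start) & Mask;
}

// Returns the decoded opcode, or -1 if no case matches.
int decode(const std::vector<uint8_t> &Table, uint64_t Insn) {
  const uint8_t *Ptr = Table.data();
  const uint8_t *End = Ptr + Table.size();
  while (Ptr != End) {
    switch (*Ptr++) {
    case OpSwitchField: {
      unsigned Start = static_cast<unsigned>(readULEB128(Ptr));
      unsigned Len = *Ptr++;
      uint64_t Field = extractField(Insn, Start, Len);
      for (;;) {
        uint64_t CaseValue = readULEB128(Ptr);
        uint64_t CaseSize = readULEB128(Ptr);
        if (CaseValue == Field)
          break; // continue with the body of this case
        if (CaseSize == 0)
          return -1; // assumed: size 0 marks the last case
        Ptr += CaseSize; // skip this case's body
      }
      break;
    }
    case OpDecode: {
      uint64_t Opcode = readULEB128(Ptr);
      (void)readULEB128(Ptr); // decoder index, unused in this sketch
      return static_cast<int>(Opcode);
    }
    default:
      return -1; // opcode not modelled here
    }
  }
  return -1;
}
```

A 15-byte table with two cases on `Inst[7:3]`, laid out like the `DecoderTable43` expectation in `VarLenDecoder.td`, is enough to exercise both the matching path and the skip path.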

jurahul · 2025-08-28T17:32:06Z

The output you showed in https://discourse.llvm.org/t/rfc-a-new-way-to-resolve-decoding-conflicts-in-llvm-s-decoder/88104/8 looks good. I have been thinking of something like a switch op as well to replace the linear search logic that we currently encode in the table. Thanks!

github-actions · 2025-09-13T10:44:48Z

✅ With the latest revision this PR passed the C/C++ code formatter.

s-barannikov · 2025-09-16T18:05:56Z

This is more or less ready for review.
(I'll need to write some description and add some comments.)

llvmbot · 2025-09-16T18:12:42Z

@llvm/pr-subscribers-llvm-mc

@llvm/pr-subscribers-tablegen

Author: Sergei Barannikov (s-barannikov)

Changes

Patch is 82.12 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/155889.diff

14 Files Affected:

(modified) llvm/include/llvm/MC/MCDecoderOps.h (+8-12)
(modified) llvm/test/TableGen/DecoderEmitter/DecoderEmitterBitwidthSpecialization.td (+22-22)
(modified) llvm/test/TableGen/DecoderEmitter/VarLenDecoder.td (+12-16)
(modified) llvm/test/TableGen/DecoderEmitter/additional-encoding.td (+32-22)
(modified) llvm/test/TableGen/DecoderEmitter/big-filter.td (+12-8)
(modified) llvm/test/TableGen/DecoderEmitter/trydecode-emission.td (+7-24)
(modified) llvm/test/TableGen/DecoderEmitter/trydecode-emission2.td (+10-20)
(modified) llvm/test/TableGen/DecoderEmitter/trydecode-emission3.td (+7-15)
(modified) llvm/test/TableGen/DecoderEmitter/trydecode-emission4.td (+7-16)
(modified) llvm/test/TableGen/DecoderEmitter/var-len-conflict-1.td (+6-13)
(modified) llvm/test/TableGen/HwModeEncodeDecode.td (+5-5)
(modified) llvm/test/TableGen/HwModeEncodeDecode2.td (+14-14)
(modified) llvm/test/TableGen/HwModeEncodeDecode3.td (+34-34)
(modified) llvm/utils/TableGen/DecoderEmitter.cpp (+662-517)

diff --git a/llvm/include/llvm/MC/MCDecoderOps.h b/llvm/include/llvm/MC/MCDecoderOps.h
index 790ff3eb4f333..4e06deb0eacee 100644
--- a/llvm/include/llvm/MC/MCDecoderOps.h
+++ b/llvm/include/llvm/MC/MCDecoderOps.h
@@ -13,19 +13,15 @@
 namespace llvm::MCD {
 
 // Disassembler state machine opcodes.
-// nts_t is either uint16_t or uint24_t based on whether large decoder table is
-// enabled.
 enum DecoderOps {
-  OPC_Scope = 1,         // OPC_Scope(nts_t NumToSkip)
-  OPC_ExtractField,      // OPC_ExtractField(uleb128 Start, uint8_t Len)
-  OPC_FilterValueOrSkip, // OPC_FilterValueOrSkip(uleb128 Val, nts_t NumToSkip)
-  OPC_FilterValue,       // OPC_FilterValue(uleb128 Val)
-  OPC_CheckField,        // OPC_CheckField(uleb128 Start, uint8_t Len,
-                         //                uleb128 Val)
-  OPC_CheckPredicate,    // OPC_CheckPredicate(uleb128 PIdx)
-  OPC_Decode,            // OPC_Decode(uleb128 Opcode, uleb128 DIdx)
-  OPC_TryDecode,         // OPC_TryDecode(uleb128 Opcode, uleb128 DIdx)
-  OPC_SoftFail,          // OPC_SoftFail(uleb128 PMask, uleb128 NMask)
+  OPC_Scope = 1,      // OPC_Scope(uleb128 Size)
+  OPC_SwitchField,    // OPC_SwitchField(uleb128 Start, uint8_t Len,
+                      //                 [uleb128 Val, uleb128 Size]...)
+  OPC_CheckField,     // OPC_CheckField(uleb128 Start, uint8_t Len, uleb128 Val)
+  OPC_CheckPredicate, // OPC_CheckPredicate(uleb128 PIdx)
+  OPC_Decode,         // OPC_Decode(uleb128 Opcode, uleb128 DIdx)
+  OPC_TryDecode,      // OPC_TryDecode(uleb128 Opcode, uleb128 DIdx)
+  OPC_SoftFail,       // OPC_SoftFail(uleb128 PMask, uleb128 NMask)
 };
 
 } // namespace llvm::MCD
diff --git a/llvm/test/TableGen/DecoderEmitter/DecoderEmitterBitwidthSpecialization.td b/llvm/test/TableGen/DecoderEmitter/DecoderEmitterBitwidthSpecialization.td
index 71b0c99675baa..5f3e2a8d2d7df 100644
--- a/llvm/test/TableGen/DecoderEmitter/DecoderEmitterBitwidthSpecialization.td
+++ b/llvm/test/TableGen/DecoderEmitter/DecoderEmitterBitwidthSpecialization.td
@@ -84,14 +84,14 @@ def Inst3 : Instruction16Bit<3> {
 // In the default case, we emit a single decodeToMCinst function and DecodeIdx
 // is shared across all bitwidths.
 
-// CHECK-DEFAULT-LABEL: DecoderTable8[25]
-// CHECK-DEFAULT:         DecodeIdx: 0
-// CHECK-DEFAULT:         DecodeIdx: 1
+// CHECK-DEFAULT-LABEL: DecoderTable8
+// CHECK-DEFAULT:         using decoder 0
+// CHECK-DEFAULT:         using decoder 1
 // CHECK-DEFAULT:       };
 
-// CHECK-DEFAULT-LABEL: DecoderTable16[25]
-// CHECK-DEFAULT:         DecodeIdx: 2
-// CHECK-DEFAULT:         DecodeIdx: 3
+// CHECK-DEFAULT-LABEL: DecoderTable16
+// CHECK-DEFAULT:         using decoder 2
+// CHECK-DEFAULT:         using decoder 3
 // CHECK-DEFAULT:       };
 
 // CHECK-DEFAULT-LABEL: template <typename InsnType>
@@ -105,10 +105,10 @@ def Inst3 : Instruction16Bit<3> {
 // When we specialize per bitwidth, we emit 2 decodeToMCInst functions and
 // DecodeIdx is assigned per bit width.
 
-// CHECK-SPECIALIZE-NO-TABLE-LABEL:   DecoderTable8[26]
-// CHECK-SPECIALIZE-NO-TABLE:           /* 0 */ 8, // Bitwidth 8
-// CHECK-SPECIALIZE-NO-TABLE:           DecodeIdx: 0
-// CHECK-SPECIALIZE-NO-TABLE:           DecodeIdx: 1
+// CHECK-SPECIALIZE-NO-TABLE-LABEL:   DecoderTable8
+// CHECK-SPECIALIZE-NO-TABLE:           8, // 0: BitWidth 8
+// CHECK-SPECIALIZE-NO-TABLE:           using decoder 0
+// CHECK-SPECIALIZE-NO-TABLE:           using decoder 1
 // CHECK-SPECIALIZE-NO-TABLE:         };
 
 // CHECK-SPECIALIZE-NO-TABLE-LABEL:   template <typename InsnType>
@@ -117,10 +117,10 @@ def Inst3 : Instruction16Bit<3> {
 // CHECK-SPECIALIZE-NO-TABLE:           case 0
 // CHECK-SPECIALIZE-NO-TABLE:           case 1
 
-// CHECK-SPECIALIZE-NO-TABLE-LABEL:   DecoderTable16[26]
-// CHECK-SPECIALIZE-NO-TABLE:           /* 0 */ 16, // Bitwidth 16
-// CHECK-SPECIALIZE-NO-TABLE:           DecodeIdx: 0
-// CHECK-SPECIALIZE-NO-TABLE:           DecodeIdx: 1
+// CHECK-SPECIALIZE-NO-TABLE-LABEL:   DecoderTable16
+// CHECK-SPECIALIZE-NO-TABLE:           16, // 0: BitWidth 16
+// CHECK-SPECIALIZE-NO-TABLE:           using decoder 0
+// CHECK-SPECIALIZE-NO-TABLE:           using decoder 1
 // CHECK-SPECIALIZE-NO-TABLE:         };
 
 // CHECK-SPECIALIZE-NO-TABLE-LABEL:   template <typename InsnType>
@@ -138,10 +138,10 @@ def Inst3 : Instruction16Bit<3> {
 // Per bitwidth specialization with function table.
 
 // 8 bit deccoder table, functions, and function table.
-// CHECK-SPECIALIZE-TABLE-LABEL:    DecoderTable8[26]
-// CHECK-SPECIALIZE-TABLE:           /* 0 */ 8, // Bitwidth 8
-// CHECK-SPECIALIZE-TABLE:            DecodeIdx: 0
-// CHECK-SPECIALIZE-TABLE:            DecodeIdx: 1
+// CHECK-SPECIALIZE-TABLE-LABEL:    DecoderTable8
+// CHECK-SPECIALIZE-TABLE:            8, // 0: BitWidth 8
+// CHECK-SPECIALIZE-TABLE:            using decoder 0
+// CHECK-SPECIALIZE-TABLE:            using decoder 1
 // CHECK-SPECIALIZE-TABLE:          };
 
 // CHECK-SPECIALIZE-TABLE-LABEL:    template <typename InsnType>
@@ -161,10 +161,10 @@ def Inst3 : Instruction16Bit<3> {
 // CHECK-SPECIALIZE-TABLE-NEXT:     };
 
 // 16 bit deccoder table, functions, and function table.
-// CHECK-SPECIALIZE-TABLE-LABEL:    DecoderTable16[26]
-// CHECK-SPECIALIZE-TABLE:            /* 0 */ 16, // Bitwidth 16
-// CHECK-SPECIALIZE-TABLE:            DecodeIdx: 0
-// CHECK-SPECIALIZE-TABLE:            DecodeIdx: 1
+// CHECK-SPECIALIZE-TABLE-LABEL:    DecoderTable16
+// CHECK-SPECIALIZE-TABLE:            16, // 0: BitWidth 16
+// CHECK-SPECIALIZE-TABLE:            using decoder 0
+// CHECK-SPECIALIZE-TABLE:            using decoder 1
 // CHECK-SPECIALIZE-TABLE:          };
 
 // CHECK-SPECIALIZE-TABLE-LABEL:    template <typename InsnType>
diff --git a/llvm/test/TableGen/DecoderEmitter/VarLenDecoder.td b/llvm/test/TableGen/DecoderEmitter/VarLenDecoder.td
index 0d913dc7587ed..0bead888d71d6 100644
--- a/llvm/test/TableGen/DecoderEmitter/VarLenDecoder.td
+++ b/llvm/test/TableGen/DecoderEmitter/VarLenDecoder.td
@@ -1,5 +1,4 @@
-// RUN: llvm-tblgen -gen-disassembler -I %p/../../../include %s | FileCheck %s --check-prefixes=CHECK,CHECK-SMALL
-// RUN: llvm-tblgen -gen-disassembler --large-decoder-table -I %p/../../../include %s | FileCheck %s --check-prefixes=CHECK,CHECK-LARGE
+// RUN: llvm-tblgen -gen-disassembler -I %p/../../../include %s | FileCheck %s --check-prefixes=CHECK
 
 include "llvm/Target/Target.td"
 
@@ -53,19 +52,16 @@ def FOO32 : MyVarInst<MemOp32> {
 // CHECK-NEXT: 43,
 // CHECK-NEXT: };
 
-// CHECK-SMALL:      /* 0 */       OPC_ExtractField, 3, 5,                // Field = Inst{7-3}
-// CHECK-SMALL-NEXT: /* 3 */       OPC_FilterValueOrSkip, 8, 4, 0,        // if Field != 0x8 skip to 11
-// CHECK-SMALL-NEXT: /* 7 */       OPC_Decode, {{[0-9]+}}, {{[0-9]+}}, 0, // Opcode: FOO16
-// CHECK-SMALL-NEXT: /* 11 */      OPC_FilterValue, 9,                    // if Field != 0x9 pop scope
-// CHECK-SMALL-NEXT: /* 13 */      OPC_Decode, {{[0-9]+}}, {{[0-9]+}}, 1, // Opcode: FOO32
-// CHECK-SMALL-NEXT: };
-
-// CHECK-LARGE:      /* 0 */       OPC_ExtractField, 3, 5,                // Field = Inst{7-3}
-// CHECK-LARGE-NEXT: /* 3 */       OPC_FilterValueOrSkip, 8, 4, 0, 0,     // if Field != 0x8 skip to 12
-// CHECK-LARGE-NEXT: /* 8 */       OPC_Decode, {{[0-9]+}}, {{[0-9]+}}, 0, // Opcode: FOO16
-// CHECK-LARGE-NEXT: /* 12 */      OPC_FilterValue, 9,                    // if Field != 0x9 pop scope
-// CHECK-LARGE-NEXT: /* 14 */      OPC_Decode, {{[0-9]+}}, {{[0-9]+}}, 1, // Opcode: FOO32
-// CHECK-LARGE-NEXT: };
+// CHECK-LABEL: static const uint8_t DecoderTable43[15] = {
+// CHECK-NEXT:    OPC_SwitchField, 3, 5,       //  0: switch Inst[7:3] {
+// CHECK-NEXT:    8, 4,                        //  3: case 0x8: {
+// CHECK-NEXT:    OPC_Decode, {{[0-9, ]+}}, 0, //  5:  decode to FOO16 using decoder 0
+// CHECK-NEXT:                                 //  5: }
+// CHECK-NEXT:    9, 0,                        //  9: case 0x9: {
+// CHECK-NEXT:    OPC_Decode, {{[0-9, ]+}}, 1, // 11:  decode to FOO32 using decoder 1
+// CHECK-NEXT:                                 // 11: }
+// CHECK-NEXT:                                 // 11: } // switch Inst[7:3]
+// CHECK-NEXT:  };
 
 // CHECK:      case 0:
 // CHECK-NEXT: tmp = fieldFromInstruction(insn, 8, 3);
@@ -86,7 +82,7 @@ def FOO32 : MyVarInst<MemOp32> {
 // CHECK-NEXT: MI.addOperand(MCOperand::createImm(tmp));
 // CHECK-NEXT: return S;
 
-// CHECK-LABEL: case OPC_ExtractField: {
+// CHECK-LABEL: case OPC_SwitchField: {
 // CHECK: makeUp(insn, Start + Len);
 
 // CHECK-LABEL: case OPC_CheckField: {
diff --git a/llvm/test/TableGen/DecoderEmitter/additional-encoding.td b/llvm/test/TableGen/DecoderEmitter/additional-encoding.td
index 0d4d3c096f83d..e60c9c35c8439 100644
--- a/llvm/test/TableGen/DecoderEmitter/additional-encoding.td
+++ b/llvm/test/TableGen/DecoderEmitter/additional-encoding.td
@@ -30,28 +30,38 @@ class I<dag out_ops, dag in_ops> : Instruction {
   let OutOperandList = out_ops;
 }
 
-// CHECK:      /* 0 */  OPC_ExtractField, 12, 4,               // Field = Inst{15-12}
-// CHECK-NEXT: /* 3 */  OPC_FilterValueOrSkip, 0, 15, 0,       // if Field != 0x0 skip to 22
-// CHECK-NEXT: /* 7 */  OPC_Scope, 8, 0,                       // end scope at 18
-// CHECK-NEXT: /* 10 */ OPC_CheckField, 6, 6, 0,               // if Inst{11-6} != 0x0
-// CHECK-NEXT: /* 14 */ OPC_Decode, {{[0-9]+}}, 2, 0,          // Opcode: {{.*}}:NOP, DecodeIdx: 0
-// CHECK-NEXT: /* 18 */ OPC_TryDecode, {{[0-9]+}}, 2, 1,       // Opcode: SHIFT0, DecodeIdx: 1
-// CHECK-NEXT: /* 22 */ OPC_FilterValueOrSkip, 1, 15, 0,       // if Field != 0x1 skip to 41
-// CHECK-NEXT: /* 26 */ OPC_Scope, 8, 0,                       // end scope at 37
-// CHECK-NEXT: /* 29 */ OPC_CheckField, 6, 6, 0,               // if Inst{11-6} != 0x0
-// CHECK-NEXT: /* 33 */ OPC_Decode, {{[0-9]+}}, 2, 0,          // Opcode: {{.*}}:NOP, DecodeIdx: 0
-// CHECK-NEXT: /* 37 */ OPC_TryDecode, {{[0-9]+}}, 2, 1,       // Opcode: SHIFT1, DecodeIdx: 1
-// CHECK-NEXT: /* 41 */ OPC_FilterValueOrSkip, 2, 15, 0,       // if Field != 0x2 skip to 60
-// CHECK-NEXT: /* 45 */ OPC_Scope, 8, 0,                       // end scope at 56
-// CHECK-NEXT: /* 48 */ OPC_CheckField, 6, 6, 0,               // if Inst{11-6} != 0x0
-// CHECK-NEXT: /* 52 */ OPC_Decode, {{[0-9]+}}, 2, 0,          // Opcode: {{.*}}:NOP, DecodeIdx: 0
-// CHECK-NEXT: /* 56 */ OPC_TryDecode, {{[0-9]+}}, 2, 1,       // Opcode: SHIFT2, DecodeIdx: 1
-// CHECK-NEXT: /* 60 */ OPC_FilterValue, 3,                    // if Field != 0x3
-// CHECK-NEXT: /* 62 */ OPC_Scope, 8, 0,                       // end scope at 73
-// CHECK-NEXT: /* 65 */ OPC_CheckField, 6, 6, 0,               // if Inst{11-6} != 0x0
-// CHECK-NEXT: /* 69 */ OPC_Decode, {{[0-9]+}}, 2, 0,          // Opcode: {{.*}}:NOP, DecodeIdx: 0
-// CHECK-NEXT: /* 73 */ OPC_TryDecode, {{[0-9]+}}, 2, 1,       // Opcode: SHIFT3, DecodeIdx: 1
-
+// CHECK-LABEL: static const uint8_t DecoderTable16[67] = {
+// CHECK-NEXT:    OPC_SwitchField, 12, 4,         //  0: switch Inst[15:12] {
+// CHECK-NEXT:    0, 14,                          //  3: case 0x0: {
+// CHECK-NEXT:    OPC_Scope, 8,                   //  5:  {
+// CHECK-NEXT:    OPC_CheckField, 6, 6, 0,        //  7:   check Inst[11:6] == 0x0
+// CHECK-NEXT:    OPC_Decode, {{[0-9, ]+}}, 0,    // 11:   decode to NOP using decoder 0
+// CHECK-NEXT:                                    // 11:  }
+// CHECK-NEXT:    OPC_TryDecode, {{[0-9, ]+}}, 1, // 15:  try decode to SHIFT0 using decoder 1
+// CHECK-NEXT:                                    // 15: }
+// CHECK-NEXT:    1, 14,                          // 19: case 0x1: {
+// CHECK-NEXT:    OPC_Scope, 8,                   // 21:  {
+// CHECK-NEXT:    OPC_CheckField, 6, 6, 0,        // 23:   check Inst[11:6] == 0x0
+// CHECK-NEXT:    OPC_Decode, {{[0-9, ]+}}, 0,    // 27:   decode to anonymous_10323:NOP using decoder 0
+// CHECK-NEXT:                                    // 27:  }
+// CHECK-NEXT:    OPC_TryDecode, {{[0-9, ]+}}, 1, // 31:  try decode to SHIFT1 using decoder 1
+// CHECK-NEXT:                                    // 31: }
+// CHECK-NEXT:    2, 14,                          // 35: case 0x2: {
+// CHECK-NEXT:    OPC_Scope, 8,                   // 37:  {
+// CHECK-NEXT:    OPC_CheckField, 6, 6, 0,        // 39:   check Inst[11:6] == 0x0
+// CHECK-NEXT:    OPC_Decode, {{[0-9, ]+}}, 0,    // 43:   decode to anonymous_10324:NOP using decoder 0
+// CHECK-NEXT:                                    // 43:  }
+// CHECK-NEXT:    OPC_TryDecode, {{[0-9, ]+}}, 1, // 47:  try decode to SHIFT2 using decoder 1
+// CHECK-NEXT:                                    // 47: }
+// CHECK-NEXT:    3, 0,                           // 51: case 0x3: {
+// CHECK-NEXT:    OPC_Scope, 8,                   // 53:  {
+// CHECK-NEXT:    OPC_CheckField, 6, 6, 0,        // 55:   check Inst[11:6] == 0x0
+// CHECK-NEXT:    OPC_Decode, {{[0-9, ]+}}, 0,    // 59:   decode to anonymous_10325:NOP using decoder 0
+// CHECK-NEXT:                                    // 59:  }
+// CHECK-NEXT:    OPC_TryDecode, {{[0-9, ]+}}, 1, // 63:  try decode to SHIFT3 using decoder 1
+// CHECK-NEXT:                                    // 63: }
+// CHECK-NEXT:                                    // 63: } // switch Inst[15:12]
+// CHECK-NEXT:  };
 
 class SHIFT<bits<2> opc> : I<(outs), (ins ShAmtOp:$shamt)>, EncSHIFT<opc>;
 def SHIFT0 : SHIFT<0>;
diff --git a/llvm/test/TableGen/DecoderEmitter/big-filter.td b/llvm/test/TableGen/DecoderEmitter/big-filter.td
index 87aa7f814c3f3..fa516ad5665a5 100644
--- a/llvm/test/TableGen/DecoderEmitter/big-filter.td
+++ b/llvm/test/TableGen/DecoderEmitter/big-filter.td
@@ -11,14 +11,18 @@ class I : Instruction {
 
 // Check that a 64-bit filter with all bits set does not confuse DecoderEmitter.
 //
-// CHECK-LABEL: static const uint8_t DecoderTable128[34] = {
-// CHECK-NEXT:  /* 0 */  OPC_ExtractField, 0, 64,        // Field = Inst{63-0}
-// CHECK-NEXT:  /* 3 */  OPC_FilterValueOrSkip, 1, 8, 0, // if Field != 0x1 skip to 15
-// CHECK-NEXT:  /* 7 */  OPC_CheckField, 127, 1, 1,      // if Inst{127} != 0x1
-// CHECK-NEXT:  /* 11 */ OPC_Decode, {{[0-9]+}}, 2, 0,   // Opcode: I2, DecodeIdx: 0
-// CHECK-NEXT:  /* 15 */ OPC_FilterValue, 255, 255, 255, 255, 255, 255, 255, 255, 255, 1, // if Field != 0xffffffffffffffff
-// CHECK-NEXT:  /* 26 */ OPC_CheckField, 127, 1, 0,      // if Inst{127} != 0x0
-// CHECK-NEXT:  /* 30 */ OPC_Decode, {{[0-9]+}}, 2, 0,   // Opcode: I1, DecodeIdx: 0
+// CHECK-LABEL: static const uint8_t DecoderTable128[32] = {
+// CHECK-NEXT:    OPC_SwitchField, 0, 64,      //  0: switch Inst[63:0] {
+// CHECK-NEXT:    1, 8,                        //  3: case 0x1: {
+// CHECK-NEXT:    OPC_CheckField, 127, 1, 1,   //  5:  check Inst[127:127] == 0x1
+// CHECK-NEXT:    OPC_Decode, {{[0-9, ]+}}, 0, //  9:  decode to I2 using decoder 0
+// CHECK-NEXT:                                 //  9: }
+// CHECK-NEXT:    255, 255, 255, 255, 255, 255, 255, 255, 255, 1, 0,
+// CHECK-NEXT:                                 // 13: case 0xffffffffffffffff: {
+// CHECK-NEXT:    OPC_CheckField, 127, 1, 0,   // 24:  check Inst[127:127] == 0x0
+// CHECK-NEXT:    OPC_Decode, {{[0-9, ]+}}, 0, // 28:  decode to I1 using decoder 0
+// CHECK-NEXT:                                 // 28: }
+// CHECK-NEXT:                                 // 28: } // switch Inst[63:0]
 // CHECK-NEXT:  };
 
 def I1 : I {
diff --git a/llvm/test/TableGen/DecoderEmitter/trydecode-emission.td b/llvm/test/TableGen/DecoderEmitter/trydecode-emission.td
index cdb1e327ad07d..8c0d715a87d27 100644
--- a/llvm/test/TableGen/DecoderEmitter/trydecode-emission.td
+++ b/llvm/test/TableGen/DecoderEmitter/trydecode-emission.td
@@ -1,5 +1,4 @@
 // RUN: llvm-tblgen -gen-disassembler -I %p/../../../include %s | FileCheck %s
-// RUN: llvm-tblgen -gen-disassembler --large-decoder-table -I %p/../../../include %s | FileCheck %s --check-prefix=CHECK-LARGE
 
 // Check that if decoding of an instruction fails and the instruction does not
 // have a complete decoder method that can determine if the bitpattern is valid
@@ -34,29 +33,13 @@ def InstB : TestInstruction {
   let hasCompleteDecoder = 0;
 }
 
-// CHECK:      /* 0 */       OPC_CheckField, 4, 4, 0,                   // if Inst{7-4} != 0x0
-// CHECK-NEXT: /* 4 */       OPC_Scope, 8, 0,                           // end scope at 15
-// CHECK-NEXT: /* 7 */       OPC_CheckField, 2, 2, 0,                   // if Inst{3-2} != 0x0
-// CHECK-NEXT: /* 11 */      OPC_TryDecode, {{[0-9]+}}, {{[0-9]+}}, 0,  // Opcode: InstB, DecodeIdx: 0
-// CHECK-NEXT: /* 15 */      OPC_Decode, {{[0-9]+}}, {{[0-9]+}}, 1,     // Opcode: InstA, DecodeIdx: 1
+// CHECK-LABEL: static const uint8_t DecoderTable8[18] = {
+// CHECK-NEXT:    OPC_CheckField, 4, 4, 0,        //  0: check Inst[7:4] == 0x0
+// CHECK-NEXT:    OPC_Scope, 8,                   //  4: {
+// CHECK-NEXT:    OPC_CheckField, 2, 2, 0,        //  6:  check Inst[3:2] == 0x0
+// CHECK-NEXT:    OPC_TryDecode, {{[0-9, ]+}}, 0, // 10:  try decode to InstB using decoder 0
+// CHECK-NEXT:                                    // 10: }
+// CHECK-NEXT:    OPC_Decode, {{[0-9, ]+}}, 1,    // 14: decode to InstA using decoder 1
 // CHECK-NEXT: };
 
 // CHECK: if (!Check(S, DecodeInstB(MI, insn, Address, Decoder))) { DecodeComplete = false; return MCDisassembler::Fail; }
-
-// CHECK:       unsigned NumToSkip = *Ptr++;
-// CHECK-NEXT:  NumToSkip |= (*Ptr++) << 8;
-// CHECK-NEXT:  return NumToSkip;
-
-// CHECK-LARGE:      /* 0 */       OPC_CheckField, 4, 4, 0,                  // if Inst{7-4} != 0x0
-// CHECK-LARGE-NEXT: /* 4 */       OPC_Scope, 8, 0, 0,                       // end scope at 16
-// CHECK-LARGE-NEXT: /* 8 */       OPC_CheckField, 2, 2, 0,                  // if Inst{3-2} != 0x0
-// CHECK-LARGE-NEXT: /* 12 */      OPC_TryDecode, {{[0-9]+}}, {{[0-9]+}}, 0, // Opcode: InstB, DecodeIdx: 0
-// CHECK-LARGE-NEXT: /* 16 */      OPC_Decode, {{[0-9]+}}, {{[0-9]+}}, 1,    // Opcode: InstA, DecodeIdx: 1
-// CHECK-LARGE-NEXT: };
-
-// CHECK-LARGE: if (!Check(S, DecodeInstB(MI, insn, Address, Decoder))) { DecodeComplete = false; return MCDisassembler::Fail; }
-
-// CHECK-LARGE:       unsigned NumToSkip = *Ptr++;
-// CHECK-LARGE-NEXT:  NumToSkip |= (*Ptr++) << 8;
-// CHECK-LARGE-NEXT:  NumToSkip |= (*Ptr++) << 16;
-// CHECK-LARGE-NEXT:  return NumToSkip;
diff --git a/llvm/test/TableGen/DecoderEmitter/trydecode-emission2.td b/llvm/test/TableGen/DecoderEmitter/trydecode-emission2.td
index 35657ff35c86f..8614639cb24da 100644
--- a/llvm/test/TableGen/DecoderEmitter/trydecode-emission2.td
+++ b/llvm/test/TableGen/DecoderEmitter/trydecode-emission2.td
@@ -1,5 +1,4 @@
 // RUN: llvm-tblgen -gen-disassembler -I %p/../../../include %s | FileCheck %s
-// RUN: llvm-tblgen -gen-disassembler --large-decoder-table -I %p/../../../include %s | FileCheck %s --check-prefix=CHECK-LARGE
 
 include "llvm/Target/Target.td"
 
@@ -31,25 +30,16 @@ def InstB : TestInstruction {
   let hasCompleteDecoder = 0;
 }
 
-// CHECK:      /* 0 */       OPC_CheckField, 2, 1, 0,
-// CHECK-NEXT: /* 4 */       OPC_CheckField, 5, 3, 0,
-// CHECK-NEXT: /* 8 */       OPC_Scope, 8, 0, // end scope at 19
-// CHECK-NEXT: /* 11 */      OPC_CheckField, 0, 2, 3,
-// CHECK-NEXT: /* 15 */      OPC_TryDecode, {{[0-9]+}}, {{[0-9]+}}, 0,
-// CHECK-NEXT: /* 19 */      OPC_CheckField, 3, 2, 0,
-// CHECK-NEXT: /* 23 */      OPC_TryDecode, {{[0-9]+}}, {{[0-9]+}}, 1,
+// CHECK-LABEL: static const uint8_t DecoderTable8[26] = {
+// CHECK-NEXT:    OPC_CheckField, 2, 1, 0,        //  0: check Inst[2:2] == 0x0
+// CHECK-NEXT:    OPC_CheckField, 5, 3, 0,        //  4: check Inst[7:5] == 0x0
+// CHECK-NEXT:    OPC_Scope, 8,                   //  8: {
+// CHECK-NEXT:    OPC_CheckField, 0, 2, 3,        // 10:  check Inst[1:0] == 0x3
+// CHECK-NEXT:    OPC_TryDecode, {{[0-9, ]+}}, 0, // 14:  try decode to InstB using decoder 0
+// CHECK-NEXT:                                    // 14: }
+// CHECK-NEXT:    OPC_CheckField, 3, 2, 0,        // 18: check Inst[4:3] == 0x0
+// CHECK-NEXT:    OPC_TryDecode, {{[0-9, ]+}}, 1, // 22: try decode to InstA using decoder 1
+// CHECK-NEXT: };
 
 // CHECK: if (!Check(S, DecodeInstB(MI, insn, Address, Decoder))) { DecodeComplete = false; return MCDisassembler::Fail; }
 // CHECK: if (!Check(S, DecodeInstA(MI, insn, Address, Decoder))) { DecodeComplete = false; return MCDisassembler::Fail; }
-
-// CHECK-LARGE:      /* 0 */       OPC_CheckField, 2, 1, 0,
-// CHECK-LARGE-NEXT: /* 4 */       OPC_CheckField, 5, 3, 0,
-// CHECK-LARGE-N...
[truncated]

jurahul · 2025-09-19T00:38:05Z

Overall LGTM. @topperc or maybe @lenary if one of you can review as well, that would be great.

jurahul · 2025-09-19T00:38:39Z

@s-barannikov can you fill in the description as well?

s-barannikov · 2025-09-19T02:57:24Z

@jurahul Filled in the description, PTAL.

jurahul · 2025-09-19T12:01:48Z

Description looks good to me as well.

lenary

LGTM

topperc

LGTM

llvm-ci · 2025-09-20T04:51:32Z

LLVM Buildbot has detected a new failure on builder lldb-arm-ubuntu running on linaro-lldb-arm-ubuntu while building llvm at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/21332

Here is the relevant piece of the build log for the reference

Step 6 (test) failure: build (failure)
...
PASS: lldb-unit :: Core/./LLDBCoreTests/98/470 (2445 of 3673)
PASS: lldb-unit :: Core/./LLDBCoreTests/99/470 (2446 of 3673)
PASS: lldb-unit :: DAP/./DAPTests/1/76 (2447 of 3673)
PASS: lldb-unit :: DAP/./DAPTests/0/76 (2448 of 3673)
PASS: lldb-unit :: DAP/./DAPTests/12/76 (2449 of 3673)
PASS: lldb-unit :: DAP/./DAPTests/13/76 (2450 of 3673)
PASS: lldb-unit :: DAP/./DAPTests/14/76 (2451 of 3673)
PASS: lldb-unit :: DAP/./DAPTests/11/76 (2452 of 3673)
PASS: lldb-unit :: DAP/./DAPTests/15/76 (2453 of 3673)
PASS: lldb-unit :: DAP/./DAPTests/16/76 (2454 of 3673)
FAIL: lldb-unit :: DAP/./DAPTests/10/76 (2455 of 3673)
******************** TEST 'lldb-unit :: DAP/./DAPTests/10/76' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/tools/lldb/unittests/DAP/./DAPTests-lldb-unit-304603-10-76.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=76 GTEST_SHARD_INDEX=10 /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/tools/lldb/unittests/DAP/./DAPTests
--

Script:
--
/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/tools/lldb/unittests/DAP/./DAPTests --gtest_filter=DisconnectRequestHandlerTest.DisconnectTriggersTerminateCommands
--
../llvm-project/lldb/unittests/DAP/Handler/DisconnectTest.cpp:51: Failure
Actual function call count doesn't match EXPECT_CALL(client, Received(Output("1\n")))...
         Expected: to be called once
           Actual: never called - unsatisfied and active

../llvm-project/lldb/unittests/DAP/Handler/DisconnectTest.cpp:52: Failure
Actual function call count doesn't match EXPECT_CALL(client, Received(Output("2\n")))...
         Expected: to be called twice
           Actual: called once - unsatisfied and active


../llvm-project/lldb/unittests/DAP/Handler/DisconnectTest.cpp:51
Actual function call count doesn't match EXPECT_CALL(client, Received(Output("1\n")))...
         Expected: to be called once
           Actual: never called - unsatisfied and active

../llvm-project/lldb/unittests/DAP/Handler/DisconnectTest.cpp:52
Actual function call count doesn't match EXPECT_CALL(client, Received(Output("2\n")))...
         Expected: to be called twice
           Actual: called once - unsatisfied and active



********************
PASS: lldb-unit :: DAP/./DAPTests/19/76 (2456 of 3673)
PASS: lldb-unit :: DAP/./DAPTests/18/76 (2457 of 3673)
PASS: lldb-unit :: DAP/./DAPTests/17/76 (2458 of 3673)
PASS: lldb-unit :: DAP/./DAPTests/20/76 (2459 of 3673)

s-barannikov force-pushed the tablegen/decoder/tree branch 3 times, most recently from a215484 to 37fae85 Compare September 5, 2025 18:13

s-barannikov force-pushed the tablegen/decoder/tree branch 3 times, most recently from 52d67a4 to 5552436 Compare September 13, 2025 10:41

s-barannikov force-pushed the tablegen/decoder/tree branch 5 times, most recently from bd0a806 to aa94a6b Compare September 16, 2025 03:55

s-barannikov mentioned this pull request Sep 16, 2025

[TableGen][DecoderEmitter] Replace opcode mask with booleans (NFC) #159113

Merged

s-barannikov added a commit that referenced this pull request Sep 16, 2025

[TableGen][DecoderEmitter] Replace opcode mask with booleans (NFC) (#159113)

0864965

Extracted from #155889, which removes inclusion of `MCDecoderOps.h`.

s-barannikov force-pushed the tablegen/decoder/tree branch 5 times, most recently from af90f9e to 71cbd51 Compare September 16, 2025 17:54

s-barannikov requested review from jurahul and topperc September 16, 2025 18:00

s-barannikov force-pushed the tablegen/decoder/tree branch from eb59743 to 71cbd51 Compare September 16, 2025 18:11

s-barannikov marked this pull request as ready for review September 16, 2025 18:12

llvmbot added tablegen llvm:mc Machine (object) code labels Sep 16, 2025

s-barannikov force-pushed the tablegen/decoder/tree branch from 71cbd51 to 3019450 Compare September 16, 2025 20:22

jurahul reviewed Sep 16, 2025

Print CheckAny as a sequence of try-else, fix SoftFail output

9905b7e

This reduces the vertical size of the output and should make it clearer. Drop "check" when printing SoftFail because this opcode cannot fail.

s-barannikov force-pushed the tablegen/decoder/tree branch from a917e65 to b3daa16 Compare September 18, 2025 16:32

s-barannikov added 4 commits September 18, 2025 20:09

Print single-bit bit range in more compact form

1892e3f

Rename CommentIndex -> LineStartIndex, comment index-related members

71a404e

Add a TODO comment to emitCheckAllNode

bc91cb2

Explain why the last child of CheckAny/SwitchField is emitted differently

2f68c1a

s-barannikov force-pushed the tablegen/decoder/tree branch from b3daa16 to 2f68c1a Compare September 18, 2025 17:19

s-barannikov added 3 commits September 18, 2025 21:50

Move predicate/decoder unification to tree construction stage

38c5af0

Really add a comment

d7b6fcf

Remove InstructionEncoding from DecodeNode

3ed446e

We only need encoding name and instruction opcode.

Document tree nodes

18fafdc

jurahul requested a review from lenary September 19, 2025 12:02

Merge branch 'main' into tablegen/decoder/tree

a574eb5

lenary approved these changes Sep 19, 2025

jurahul approved these changes Sep 19, 2025

topperc approved these changes Sep 19, 2025

Merge branch 'main' into tablegen/decoder/tree

fdac726

s-barannikov enabled auto-merge (squash) September 20, 2025 01:55

s-barannikov merged commit 60bdf09 into llvm:main Sep 20, 2025
9 checks passed

s-barannikov deleted the tablegen/decoder/tree branch September 20, 2025 01:58
